Chess
Maia-2: A Unified Model for Human-AI Alignment in Chess
There are an increasing number of domains in which artificial intelligence (AI) systems both surpass human ability and accurately model human behavior. This introduces the possibility of algorithmically-informed teaching in these domains through more relatable AI partners and deeper insights into human decision-making. Critical to achieving this goal, however, is coherently modeling human behavior at various skill levels. Chess is an ideal model system for conducting research into this kind of human-AI alignment, with its rich history as a pivotal testbed for AI research, mature superhuman AI systems like AlphaZero, and precise measurements of skill via chess rating systems. Previous work in modeling human decision-making in chess uses completely independent models to capture human style at different skill levels, meaning they lack coherence in their ability to adapt to the full spectrum of human improvement and are ultimately limited in their effectiveness as AI partners and teaching tools. In this work, we propose a unified modeling approach for human-AI alignment in chess that coherently captures human style across different skill levels and directly captures how people improve. Recognizing the complex, non-linear nature of human learning, we introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions, enabling our model to be sensitive to evolving player skill. Our experimental results demonstrate that this unified framework significantly enhances the alignment between AI and human players across a diverse range of expertise levels, paving the way for deeper insights into human decision-making and AI-guided teaching tools. Our implementation is available here.
Appendix
A Acknowledgement
B Different Chess Formats
B.1 Universal Chess Interface (UCI)
B.2 Standard Algebraic Notation (SAN)
B.3 Portable Game Notation (PGN)
We thank Jiacheng Liu for his work on collecting chess-related data and the chess book list.

B.1 Universal Chess Interface (UCI)
The UCI format is widely used for communication between chess engines and user interfaces. It represents a chess move by concatenating the starting and ending squares of the moving piece, such as "e2e4" to indicate moving the pawn from e2 to e4.

B.2 Standard Algebraic Notation (SAN)
SAN (Standard Algebraic Notation) is a widely used notation system for recording and describing chess moves. It provides a standardized and concise representation of moves that is easily understood by chess players and enthusiasts. In SAN, each move is represented by two components: the piece abbreviation and the destination square. The piece abbreviation is a letter that represents the type of piece making the move: "K" for king, "Q" for queen, "R" for rook, "B" for bishop, "N" for knight, and no abbreviation for pawns. The destination square is denoted by a letter (a-h) for the column (file) and a number (1-8) for the row (rank) on the chessboard. Additional symbols may be used to indicate specific move types. The symbol "+" indicates a check, while "#" denotes a checkmate. Castling moves are represented by "O-O" for kingside castling and "O-O-O" for queenside castling.

B.3 Portable Game Notation (PGN)
PGN (Portable Game Notation) is a widely adopted format for recording chess games. It includes not only the SAN moves but also additional information such as player names, event details, and game results. PGN files are human-readable and can be easily shared and analyzed.

FEN (Forsyth-Edwards Notation) is a notation system used to describe the state of a chess game. It represents the positions of pieces on the chessboard, the active color, castling rights, en passant targets, and the half-move and full-move counters. The active color is represented by "w" for white or "b" for black.
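The FEN fields and UCI move strings described above can be illustrated with a short Python sketch. It is not taken from any of the listed papers: the names `FenFields`, `parse_fen`, and `parse_uci_move` are hypothetical, the six-field FEN layout follows the standard format, and the optional fifth character for a promotion piece (as in "e7e8q") is standard UCI usage that the text above does not mention.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Standard starting position in FEN (six space-separated fields).
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"


@dataclass
class FenFields:
    """Hypothetical container for the six FEN fields."""
    placement: str        # piece placement, rank 8 first, ranks separated by "/"
    active_color: str     # "w" or "b"
    castling: str         # e.g. "KQkq" or "-"
    en_passant: str       # target square such as "e3", or "-"
    halfmove_clock: int   # half-moves since the last capture or pawn move
    fullmove_number: int  # starts at 1, incremented after Black's move


def parse_fen(fen: str) -> FenFields:
    """Split a FEN string into its fields with light validation only."""
    parts = fen.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(parts)}")
    placement, color, castling, en_passant, half, full = parts
    if color not in ("w", "b"):
        raise ValueError(f"active color must be 'w' or 'b', got {color!r}")
    if placement.count("/") != 7:
        raise ValueError("piece placement must describe 8 ranks")
    return FenFields(placement, color, castling, en_passant, int(half), int(full))


def parse_uci_move(move: str) -> Tuple[str, str, Optional[str]]:
    """Split a UCI move such as 'e2e4' into (from_square, to_square, promotion)."""
    if len(move) not in (4, 5):
        raise ValueError(f"UCI move must have 4 or 5 characters: {move!r}")
    src, dst = move[:2], move[2:4]
    for square in (src, dst):
        if square[0] not in "abcdefgh" or square[1] not in "12345678":
            raise ValueError(f"invalid square {square!r} in move {move!r}")
    promotion = move[4] if len(move) == 5 else None
    return src, dst, promotion


if __name__ == "__main__":
    print(parse_fen(START_FEN))
    print(parse_uci_move("e2e4"))
```

The sketch only splits and sanity-checks the strings; it does not verify that a position or move is legal, which would require a full move generator.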
ChessGPT: Bridging Policy Learning and Language Modeling
When solving decision-making tasks, humans typically depend on information from two key sources: (1) historical policy data, which provides interaction replay from the environment, and (2) analytical insights in natural language form, which expose the underlying thought process or strategic considerations. Despite this, the majority of preceding research focuses on only one source: it either uses historical replay exclusively to directly learn policy or value functions, or trains language models on a language corpus alone. In this paper, we argue that a powerful autonomous agent should cover both sources. Thus, we propose ChessGPT, a GPT model bridging policy learning and language modeling by integrating data from these two sources in chess games. Specifically, we build a large-scale game and language dataset related to chess.
Rotated Bitboards in FUSc# and Reinforcement Learning in Computer Chess and Beyond
There exist several techniques for representing the chess board inside the computer. In the first part of this paper, the concepts of the bitboard representation and the advantages of (rotated) bitboards in move generation are explained. In order to illustrate those ideas in practice, the concrete implementation of the move generator in FUSc# is discussed, and we explain a technique for verifying the move generator with the "perft" command. We show that the move generator of FUSc# works 100% correctly. The second part of this paper deals with reinforcement learning in computer chess (and beyond). We exemplify the progress that has been made in this field in the last 15-20 years by comparing the "state of the art" from 2002-2008, when FUSc# was developed, with recent innovations connected to "AlphaZero". We discuss how a "FUSc#-Zero" could be implemented and what would be necessary to reduce the number of training games needed to achieve good performance. This can be seen as a test case for the general problem of improving "sample efficiency" in reinforcement learning. In the final part, we move beyond computer chess, as the importance of sample efficiency extends far beyond board games into a wide range of applications where data is costly, difficult to obtain, or time-consuming to generate. We review some applications of the ideas developed in AlphaZero in other domains, namely the "other Alphas" such as AlphaFold, AlphaTensor, AlphaGeometry and AlphaProof. We also discuss future research and the potential of such methods for ecological economic planning.
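To make the bitboard idea concrete, here is a minimal Python sketch of plain (not rotated) bitboards. It is not FUSc#'s implementation: the square mapping (a1 = bit 0, h8 = bit 63) and all names are assumptions chosen for illustration. It generates knight attacks with shifts and file masks, then counts White's pawn and knight moves in the starting position. Since only pawns and knights can move there, the count can be compared with the well-known perft(1) value of 20, which is the kind of check a "perft" test performs at greater depth.

```python
FULL = 0xFFFFFFFFFFFFFFFF      # all 64 squares
NOT_A = 0xFEFEFEFEFEFEFEFE     # every square except the a-file
NOT_AB = 0xFCFCFCFCFCFCFCFC    # every square except the a- and b-files
NOT_H = 0x7F7F7F7F7F7F7F7F     # every square except the h-file
NOT_GH = 0x3F3F3F3F3F3F3F3F    # every square except the g- and h-files

# Starting position, assuming a1 = bit 0, b1 = bit 1, ..., h8 = bit 63.
WHITE_PAWNS = 0x000000000000FF00
WHITE_KNIGHTS = 0x0000000000000042   # b1 and g1
WHITE_ALL = 0x000000000000FFFF
BLACK_ALL = 0xFFFF000000000000
RANK_3 = 0x0000000000FF0000


def popcount(bb: int) -> int:
    """Number of set bits (squares) in a bitboard."""
    return bin(bb).count("1")


def squares(bb: int):
    """Yield each set bit of a bitboard as a single-bit bitboard."""
    while bb:
        low = bb & -bb
        yield low
        bb ^= low


def knight_attacks(knights: int) -> int:
    """All squares attacked by the knights in the given bitboard.

    Each shift is one of the eight knight jumps; the mask removes
    targets that wrapped around the left or right board edge.
    """
    return (
        ((knights << 17) & NOT_A)      # up 2, right 1
        | ((knights << 15) & NOT_H)    # up 2, left 1
        | ((knights << 10) & NOT_AB)   # up 1, right 2
        | ((knights << 6) & NOT_GH)    # up 1, left 2
        | ((knights >> 17) & NOT_H)    # down 2, left 1
        | ((knights >> 15) & NOT_A)    # down 2, right 1
        | ((knights >> 10) & NOT_GH)   # down 1, left 2
        | ((knights >> 6) & NOT_AB)    # down 1, right 2
    ) & FULL


def count_start_moves() -> int:
    """Count White's pawn pushes and knight moves in the starting position."""
    empty = ~(WHITE_ALL | BLACK_ALL) & FULL
    single = (WHITE_PAWNS << 8) & empty          # one-square pawn pushes
    double = ((single & RANK_3) << 8) & empty    # two-square pawn pushes
    knight_moves = sum(
        popcount(knight_attacks(k) & ~WHITE_ALL & FULL)
        for k in squares(WHITE_KNIGHTS)
    )
    return popcount(single) + popcount(double) + knight_moves


if __name__ == "__main__":
    print(count_start_moves())
```

Sliding pieces (rooks, bishops, queens) are where rotated bitboards matter, because their attacks depend on the occupancy of a whole line; that part is deliberately left out of this sketch.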
A Spymaster Sheikh Controls a $1.5 Trillion Fortune. He Wants to Use It to Dominate AI
For a while in the mid-2000s, a refrigerator-sized box in Abu Dhabi was considered the greatest chess player in the world. Its name was Hydra, and it was a small super-computer--a cabinet full of industrial-grade processors and specially designed chips, strung together with fiber-optic cables and jacked into the internet. At a time when chess was still the main gladiatorial arena for competition between humans and AI, Hydra and its exploits were briefly the stuff of legend. The New Yorker published a contemplative 5,000-word feature about its emergent creativity; WIRED declared Hydra "fearsome"; and chess publications covered its victories with the violence of wrestling commentary. Hydra, they wrote, was a "monster machine" that "slowly strangled" human grand masters.
Study of the Proper NNUE Dataset
Tan, Daniel, Medina, Neftali Watkinson
NNUE (Efficiently Updatable Neural Networks) has revolutionized chess engine development, with nearly all top engines adopting NNUE models to maintain competitive performance. A key challenge in NNUE training is the creation of high-quality datasets, particularly in complex domains like chess, where tactical and strategic evaluations are essential. However, methods for constructing effective datasets remain poorly understood and under-documented. In this paper, we propose an algorithm for generating and filtering datasets composed of "quiet" positions--positions that are stable and free from tactical volatility. Our approach provides a clear methodology for dataset creation, which can be replicated and generalized across various evaluation functions. Testing demonstrates significant improvements in engine performance, confirming the effectiveness of our method.
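The abstract does not spell out the filtering algorithm, so the following Python sketch shows only one commonly used heuristic for "quiet" positions and should not be read as the authors' method. It assumes each candidate position has already been annotated by an engine. The record type `Sample`, its field names, and the 50-centipawn threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Sample:
    """Hypothetical pre-annotated training position."""
    fen: str
    in_check: bool              # side to move is in check
    best_move_is_capture: bool  # engine's preferred move is a capture
    static_eval: int            # static evaluation, in centipawns
    search_eval: int            # evaluation after a search, in centipawns


def is_quiet(sample: Sample, max_gap: int = 50) -> bool:
    """Heuristic quietness test (an assumption, not the paper's algorithm).

    A position is treated as quiet when the side to move is not in check,
    the best move is not a capture, and the static evaluation stays close
    to the searched evaluation, suggesting no pending tactics.
    """
    if sample.in_check or sample.best_move_is_capture:
        return False
    return abs(sample.static_eval - sample.search_eval) <= max_gap


def filter_quiet(samples: Iterable[Sample], max_gap: int = 50) -> List[Sample]:
    """Keep only the samples that pass the quietness test."""
    return [s for s in samples if is_quiet(s, max_gap)]


if __name__ == "__main__":
    # Placeholder FEN strings; the annotations are made-up illustration values.
    data = [
        Sample("position-1", False, False, 20, 35),
        Sample("position-2", True, False, 10, 15),
        Sample("position-3", False, True, 0, 5),
        Sample("position-4", False, False, 30, 250),
    ]
    print([s.fen for s in filter_quiet(data)])
```

The threshold trades dataset size against noise: a smaller `max_gap` discards more positions but leaves labels that a static evaluator can plausibly learn.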
Complete Implementation of WXF Chinese Chess Rules
Tan, Daniel, Medina, Neftali Watkinson
Unlike in Western chess, where all repetitions are draws, a repetition in Chinese Chess can result in a win, draw, or loss depending on the kind of repetition made by each player. One of the biggest hurdles facing Chinese Chess application development is a proper system for judging games correctly. This paper introduces a complete algorithm that applies the WXF rules correctly in all 110 example cases found in the WXF manual. We introduce several novel optimizations for speeding up repetition handling without compromising program correctness. This algorithm is usable in engines: integrating it into our prototype engine increased playing strength by 10 rating points, or a 5% increase in win rate.
Maia-2: A Unified Model for Human-AI Alignment in Chess
Tang, Zhenwei, Jiao, Difan, McIlroy-Young, Reid, Kleinberg, Jon, Sen, Siddhartha, Anderson, Ashton
There are an increasing number of domains in which artificial intelligence (AI) systems both surpass human ability and accurately model human behavior. This introduces the possibility of algorithmically-informed teaching in these domains through more relatable AI partners and deeper insights into human decision-making. Critical to achieving this goal, however, is coherently modeling human behavior at various skill levels. Chess is an ideal model system for conducting research into this kind of human-AI alignment, with its rich history as a pivotal testbed for AI research, mature superhuman AI systems like AlphaZero, and precise measurements of skill via chess rating systems. Previous work in modeling human decision-making in chess uses completely independent models to capture human style at different skill levels, meaning they lack coherence in their ability to adapt to the full spectrum of human improvement and are ultimately limited in their effectiveness as AI partners and teaching tools. In this work, we propose a unified modeling approach for human-AI alignment in chess that coherently captures human style across different skill levels and directly captures how people improve. Recognizing the complex, non-linear nature of human learning, we introduce a skill-aware attention mechanism to dynamically integrate players' strengths with encoded chess positions, enabling our model to be sensitive to evolving player skill. Our experimental results demonstrate that this unified framework significantly enhances the alignment between AI and human players across a diverse range of expertise levels, paving the way for deeper insights into human decision-making and AI-guided teaching tools. Our implementation is available here.